Treatment of Multiword Expressions and Compounds in Bulgarian

Authors

  • Petya Osenova
  • Kiril Simov
Abstract

The paper shows that catena representation together with valence information can provide a good way of encoding Multiword Expressions (beyond idioms). It also discusses a strategy for mapping noun/verb compounds to their counterpart syntactic phrases. The data on Multiword Expressions comes from BulTreeBank, while the data on compounds comes from a morphological dictionary of Bulgarian.

Similar resources

The Far Reach of Multiword Expressions in Educational Technology

Multiword expressions as they appear as nominal compounds, collocational forms, and idioms are now leveraged in educational technology in assessment and instruction contexts. The talk will focus on how multiword expression identification is used in different kinds of educational applications, including automated essay evaluation, and teacher professional development in curriculum development fo...

Domain-Dependent Identification of Multiword Expressions

The identification of different kinds of multiword expressions requires different solutions; in addition, there might be domain-related differences in their frequency and typology. In this paper, we show how our methods developed for identifying noun compounds and light verb constructions can be adapted to different domains and different types of texts. Our results indicate that with littl...

Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality

We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...

Distinguishing Subtypes of Multiword Expressions Using Linguistically-Motivated Statistical Measures

We identify several classes of multiword expressions that each require a different encoding in a (computational) lexicon, as well as a different treatment within a computational system. We examine linguistic properties pertaining to the degree of semantic idiosyncrasy of these classes of expressions. Accordingly, we propose statistical measures to quantify each property, and use the measures to...

Automatic acquisition of “noun+verb” idiomatic compounds in Korean

Song, Sanghoun. 2015. Automatic acquisition of “noun+verb” idiomatic compounds in Korean. Linguistic Research 32(1), 253-280. The state-of-the-art skills of computational linguistics pay attention to lexical semantics, because it has a potential to be used to improve language processing systems in terms of coverage as well as accuracy. In particular, utilizing multiword expressions is important...

Publication date: 2014